PodCastle: A Spoken Document Retrieval Service Improved by Anonymous User Contributions
Abstract
In this invited paper, we introduce a public web service, PodCastle, that provides full-text search of speech data (Japanese podcasts) on the basis of automatic speech recognition technologies. This is an instance of our research approach, Speech Recognition Research 2.0, which aims to provide users with a web service based on Web 2.0 so that they can experience state-of-the-art speech recognition performance, and to promote speech recognition technologies in cooperation with anonymous users. PodCastle enables users to find podcasts that include a search term, read the full text of their recognition results, and correct recognition errors simply by selecting from a list of candidates. Even when a state-of-the-art speech recognizer is used to recognize podcasts on the web, a number of errors will naturally occur. PodCastle therefore encourages users to cooperate by correcting these errors so that those podcasts can be searched more reliably. Furthermore, by using the resulting corrections to train the speech recognizer, it implements a mechanism that gradually improves speech recognition performance. From our experience with its practical use over the past 46 months (since December 2006), we confirmed that the performance of PodCastle was improved by contributions from a number of anonymous users.
Similar resources
PodCastle: Recent Advances of a Spoken Document Retrieval Service Improved by Anonymous User Contributions
In this paper, we introduce recent advances of a speech retrieval web service, PodCastle, that collects and amplifies voluntary contributions by anonymous users. Our goal is to provide users with a public web service based on speech recognition and crowdsourcing so that they can experience state-of-the-art speech recognition performance through a useful service. PodCastle enables users to find ...
PodCastle: Collaborative Training of Language Models on the Basis of Wisdom of Crowds
This paper presents a language-model training method for improving automatic transcription of online spoken contents. Unlike previously studied LVCSR tasks such as broadcast news and lectures, large-sized task-specific corpora for training language models cannot be prepared and used in recognition because of the diversity of topics, vocabularies, and speaking styles. To overcome difficulties in...
PodCastle and Songle: Crowdsourcing-Based Web Services for Retrieval and Browsing of Speech and Music Content
This paper describes two web services, PodCastle and Songle, that collect voluntary contributions by anonymous users in order to improve the experiences of users listening to speech and music content available on the web. These services use automatic speech-recognition and music-understanding technologies to provide content analysis results, such as full-text speech transcriptions and music scen...
Efficient Interactive Retrieval of Spo Ranked by Reinforcem
Unlike written documents, spoken documents are difficult to display on the screen; it is also difficult for users to browse these documents during retrieval. It has been proposed recently to use interactive multi-modal dialogues to help the user navigate through a spoken document archive to retrieve the desired documents. This interaction is based on a topic hierarchy constructed by the key ter...
Podcastle: a web 2.0 approach to speech recognition research
In this paper, we describe a public web service, “PodCastle”, that provides full-text searching of Japanese podcasts on the basis of automatic speech recognition. This is an instance of our research approach, “Speech Recognition Research 2.0”, which is aimed at providing users with a web service based on Web 2.0 so that they can experience state-of-the-art speech recognition performance, and at...